Listing of Claims: 

1. (Currently amended) A method of identifying molecules for production, wherein the
molecules are represented by concatenated strings, said method comprising: 

i) encoding two or more biological molecules into a data structure of initial character 
strings to provide a collection of two or more different initial character strings wherein each 
of said biological molecules comprises at least about 10 subunits; 

ii) selecting at least two substrings from said initial character strings; 

iii) concatenating said substrings to form one or more product strings about the same 
length as one or more of the initial character strings; 

iv) adding the product strings to a data structure to populate a data structure of 
product strings; 

v) determining sequence identities of at least one of the product strings relative to at 
least one initial character string; and 

vi) selecting one or more product biological molecules for production, wherein the 
one or more product biological molecules correspond to one or more of the product strings 
having greater than 30% sequence identity with the at least one initial character string. 
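
For illustration only, steps (i) through (vi) of claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function names (percent_identity, crossover, select_products), the single random crossover point, and the position-by-position identity measure on ungapped strings are all hypothetical choices that do not appear in the claims.

```python
import random


def percent_identity(a, b):
    """Percent of positions holding the same character.

    Assumption: strings are compared position by position without gaps,
    and the longer length is used as the denominator.
    """
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / max(len(a), len(b))


def crossover(left_parent, right_parent, point):
    """Steps (ii)-(iii): take one substring from each parent and concatenate."""
    return left_parent[:point] + right_parent[point:]


def select_products(initial, threshold=30.0, seed=0):
    """Steps (iv)-(vi): collect product strings, score them, keep those above threshold."""
    rng = random.Random(seed)
    products = []  # the data structure of product strings
    for i, a in enumerate(initial):
        for j, b in enumerate(initial):
            if i == j:
                continue
            # Hypothetical choice: one random crossover point per ordered pair.
            point = rng.randrange(1, min(len(a), len(b)))
            products.append(crossover(a, b, point))
    return [
        p for p in products
        if any(percent_identity(p, s) > threshold for s in initial)
    ]


# Step (i): two molecules of at least about 10 subunits, encoded as strings.
initial_strings = ["ACGTACGTACGT", "ACGAACGTTCGA"]
selected = select_products(initial_strings)
```

Each product string here has the length of its right-hand parent, which keeps it about the same length as the initial character strings.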

2. (Previously presented) The method of claim 1, wherein said encoding 
comprises encoding two or more nucleic acid sequences into said character strings. 

3. (Previously presented) The method of claim 2, wherein said two or more 
nucleic acid sequences comprise a nucleic acid sequence encoding a naturally occurring 
protein. 

4. (Previously presented) The method of claim 1, wherein said encoding 
comprises encoding two or more amino acid sequences into said character strings. 

5. (Previously presented) The method of claim 4, wherein said two or more amino 
acid sequences comprise an amino acid sequence encoding a naturally occurring protein. 

6. (Previously presented) The method of claim 1, wherein said initial character 
strings have at least 30% sequence identity with each other. 

7. (Previously presented) The method of claim 1, wherein said selecting in (ii)
comprises selecting at least one substring from an initial character string such that the ends of 
said substring occur in string regions of about 3 to about 20 characters in the initial character 
string that have higher sequence identity with the corresponding region of another of said 
initial character strings than the overall sequence identity between the two initial character 
strings. 
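
As a hedged illustration of the selection in claim 7, the sketch below lists window start positions where the local identity between two initial character strings exceeds their overall identity. The function names, the fixed window width, and the assumption that the two strings are already aligned and of equal length are hypothetical and are not taken from the claims.

```python
def count_matches(a, b):
    """Number of positions at which two strings hold the same character."""
    return sum(x == y for x, y in zip(a, b))


def high_identity_windows(a, b, width=5):
    """Start indices of windows whose identity exceeds the overall identity.

    Assumption: a and b are pre-aligned, ungapped, and of equal length.
    """
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    if not 3 <= width <= 20:
        raise ValueError("width must be between 3 and 20 characters")
    total = count_matches(a, b)
    starts = []
    for i in range(len(a) - width + 1):
        local = count_matches(a[i:i + width], b[i:i + width])
        # Integer form of local / width > total / len(a), avoiding float rounding.
        if local * len(a) > total * width:
            starts.append(i)
    return starts


windows = high_identity_windows("ACGTACGTAA", "ACGTATTTCC")
```

A substring end placed inside any returned window falls in a region that is more conserved than the pair of strings as a whole.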

8. (Previously presented) The method of claim 1, wherein said selecting in (ii) 
comprises selecting substrings such that the ends of said substrings occur in predefined 
motifs of about 4 to about 8 characters. 
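
One possible reading of claim 8 is sketched below: a crossover point is placed inside a predefined motif wherever that motif occurs at the same position in both initial character strings. The function name, the requirement that the motif occupy the same position in both strings, and the choice of the motif midpoint as the cut are illustrative assumptions only.

```python
def motif_crossover_points(a, b, motif):
    """Cut points lying inside occurrences of motif shared by a and b.

    Assumption: the strings are pre-aligned, so a shared motif sits at the
    same index in both; the cut is placed at the motif midpoint.
    """
    if not 4 <= len(motif) <= 8:
        raise ValueError("motif must be about 4 to about 8 characters")
    points = []
    start = a.find(motif)
    while start != -1:
        if b.startswith(motif, start):
            points.append(start + len(motif) // 2)
        start = a.find(motif, start + 1)
    return points


a = "AAGGATCCTTTT"
b = "CCGGATCCAAAA"
points = motif_crossover_points(a, b, "GGATCC")
product = a[:points[0]] + b[points[0]:]
```

Because both substrings end inside the shared motif, the motif is preserved intact in the product string.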

9. (canceled) 

10. (Previously presented) The method of claim 1, wherein said selecting in (ii) 
comprises aligning two or more of said initial character strings to maximize pairwise identity 
between two or more substrings of the initial character strings, and selecting a character that 
is a member of an aligned pair for the end of one of the two or more substrings. 

11. (canceled)

12. (Previously presented) The method of claim 1, wherein said method further 
comprises randomly altering one or more characters of said initial or product character 
strings. 

13. (Currently amended) The method of claim 12, wherein said method further 
comprises randomly selecting and altering one or more occurrences of a particular 
preselected character in said initial or product character strings. 
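
The random alteration of claims 12 and 13 could be sketched as follows. This is an illustrative sketch only: the function names, the per-character alteration rate, and the rule that a replacement always differs from the original character are assumptions that do not appear in the claims.

```python
import random


def alter_randomly(s, alphabet, rate, rng):
    """Claim 12 sketch: alter each character with probability `rate`."""
    out = []
    for c in s:
        if rng.random() < rate:
            # Assumption: the replacement is always a different character.
            out.append(rng.choice([x for x in alphabet if x != c]))
        else:
            out.append(c)
    return "".join(out)


def alter_preselected(s, target, alphabet, rate, rng):
    """Claim 13 sketch: alter only occurrences of one preselected character."""
    out = []
    for c in s:
        if c == target and rng.random() < rate:
            out.append(rng.choice([x for x in alphabet if x != c]))
        else:
            out.append(c)
    return "".join(out)


rng = random.Random(1)
unchanged = alter_randomly("ACGTACGT", "ACGT", 0.0, rng)
all_changed = alter_randomly("ACGTACGT", "ACGT", 1.0, rng)
only_a_changed = alter_preselected("AACG", "A", "ACGT", 1.0, rng)
```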

14. (Previously presented) The method of claim 1, wherein said encoding, 
selecting, or concatenating is performed on an internet site. 

15. (Previously presented) The method of claim 1, wherein said encoding, 
selecting, or concatenating is performed on a server. 

16. (Previously presented) The method of claim 1, wherein said encoding, 
selecting, or concatenating is performed on a client linked to a network. 

17. (Currently Amended) A computer program product on a computer readable
media comprising computer code that: 

i) encodes two or more biological molecules into initial character strings to provide a 
collection of two or more different initial character strings wherein each of said biological 
molecules comprises at least about ten subunits; 

ii) selects at least two initial substrings from said initial character strings; 

iii) concatenates said substrings to form one or more product strings about the same 
length as one or more of the initial character strings; 

iv) adds the product strings to a data structure to populate a data structure of product
strings;

v) determines sequence identities of at least one of the product strings relative to at 
least one initial character string; and 

vi) selects one or more product biological molecules for production, wherein the one 
or more product biological molecules correspond to one or more of the product strings having 
greater than 30% sequence identity with the at least one initial character string. 

18. (Currently amended) The computer program product of claim 17, wherein 
said two or more biological molecules are nucleic acid sequences encoding naturally
occurring proteins.

19. (Previously presented) The computer program product of claim 17, wherein 
said two or more biological molecules are nucleic acid sequences encoding naturally 
occurring proteins. 

20. (Previously presented) The computer program product of claim 17, wherein 
said two or more biological molecules are amino acid sequences. 

21. (Currently amended) The computer program product of claim 17, wherein 
said initial character strings have at least 30% sequence identity with each other.

22. (Previously presented) The computer program product of claim 17, wherein 
said computer code selects in (ii) at least one substring from an initial character string such 
that the ends of said substring occur in string regions of about three to about twenty 
characters in the initial character string that have higher sequence identity with a 
corresponding region of another of said initial character strings than the overall sequence 
identity between the two initial character substrings. 

23. (Previously presented) The computer program product of claim 17, wherein
said computer code selects substrings such that the ends of said substrings occur in 
predefined motifs of about 4 to about 8 characters. 

24. (canceled) 

25. (Previously presented) The computer program product of claim 17, wherein the 
computer code selects substrings by aligning two or more of said initial character strings to 
maximize pairwise identity between two or more substrings of the character strings, and 
selecting a character that is a member of an aligned pair for the end of one substring. 

26. (canceled) 

27. (Currently amended) The computer program product of claim 17, wherein 
said computer code additionally randomly alters one or more characters of said initial or 
product character strings. 

28. (Currently amended) The computer program product of claim 27, wherein 
said computer code additionally randomly selects and alters one or more occurrences of a 
particular preselected character in said initial or product character strings. 

29. (Previously presented) The computer program product of claim 17, wherein 
said computer code is stored on media selected from the group consisting of magnetic media, 
optical media, and optomagnetic media. 

30. (Previously presented) The computer program product of claim 17, wherein 
said computer code is in dynamic or static memory of a computer. 

31-44. (canceled) 

45. (Currently amended) The method of claim 1, wherein the initial character 
strings of (i) are related in that they encode the same gene or protein family but differ in 
sequence.

46. (canceled) 

47. (Previously presented) The method of claim 1, further comprising determining
a computationally predicted property for molecules represented by the product strings. 

48. (Previously presented) The method of claim 1, wherein the molecules 
represented by the product strings are made in parallel in an array of vessels. 

49. (Previously presented) The method of claim 1, wherein the molecules 
represented by the product strings are made by assembly of oligonucleotides. 

50. (canceled) 

51. (Currently amended) The computer program product of claim 17, wherein the
initial character strings of (i) are related in that they encode the same gene or protein family
but differ in sequence.

52. (Previously presented) The computer program product of claim 17, wherein the 
code instructs physical screening of the molecule(s) represented by the product strings for 
one or more desired properties. 

53. (Previously presented) The computer program product of claim 17, wherein the 
code instructs determination of a computationally predicted property for molecules 
represented by the product strings. 

54. (Canceled) 

55. (Canceled) 

56. (Previously presented) The computer program product of claim 17, wherein the 
code tests members of the data structure of product strings for a particular property and 
determines sequence differences responsible for differences in the particular property using 
multi-variate analysis. 

57. (Currently amended) A method of identifying molecules for production, 
wherein the molecules are represented by concatenated strings, said method comprising: 

i) encoding two or more related biological molecules into a data structure of initial 
character strings to provide a collection of two or more different initial character strings 
wherein each of said biological molecules comprises at least about 10 subunits; 

ii) selecting at least two substrings from said initial character strings; 

iii) concatenating said substrings to form one or more product strings;

iv) adding the product strings to a data structure to populate a data structure of 
product strings; and 

v) determining whether at least one of the product strings have at least a predefined 
measure of similarity with at least one initial character string; and 

vi) selecting one or more product biological molecules for production, wherein the 
one or more product biological molecules correspond to one or more of the product strings 
determined to have greater than the predefined value of sequence identity with at least one 
initial string. 

58. (Previously presented) The method of claim 1, wherein the one or more 
product strings of (vi) have greater than 50% sequence identity with the at least one initial 
character string. 

59. (Previously presented) The method of claim 1, wherein the one or more 
product strings of (vi) have greater than 75% sequence identity with the at least one initial 
character string. 

60. (Previously presented) The method of claim 1, wherein the one or more
product strings of (vi) have greater than 85% sequence identity with the at least one initial 
character string. 

61. (Previously presented) The method of claim 1, wherein the one or more
product strings of (vi) have greater than 90% sequence identity with the at least one initial 
character string. 

62. (Previously presented) The method of claim 1, wherein the one or more 
product strings of (vi) have greater than 95% sequence identity with the at least one initial 
character string. 

63. (Previously presented) The computer program product of claim 17, wherein the 
one or more product strings of (vi) having greater than 50% sequence identity with the at 
least one initial character string. 

64. (Previously presented) The computer program product of claim 17, wherein the 
one or more product strings of (vi) having greater than 75% sequence identity with the at 
least one initial character string. 

65. (Previously presented) The computer program product of claim 17, wherein the
one or more product strings of (vi) having greater than 95% sequence identity with the at 
least one initial character string. 

66. (Currently amended) A method of identifying molecules for production, 
wherein the molecules are represented by concatenated strings, said method comprising: 

i) encoding two or more biological molecules into a data structure of initial character 
strings to provide a collection of two or more different initial character strings wherein each 
of said biological molecules comprises at least about 10 subunits; 

ii) selecting at least two substrings from said initial character strings; 

iii) concatenating said substrings to form one or more product strings about the same 
length as one or more of the initial character strings; 

iv) adding the product strings to a data structure to populate a data structure of 
product strings; 

v) providing an alignment of at least one of the product strings relative to at least one 
initial character string; and

vi) selecting one or more product biological molecules for production, wherein the 
one or more product biological molecules correspond to one or more of the product strings 
having greater than 30% sequence identity with the at least one initial character string. 

67. (Previously presented) The method of claim 66, wherein said encoding 
comprises encoding two or more amino acid sequences into said character strings, and 
wherein said two or more amino acid sequences comprise an amino acid sequence encoding a 
naturally occurring protein. 

68. (Previously presented) The method of claim 66, wherein said initial character 
strings have at least 30% sequence identity with each other. 

69. (Previously presented) The method of claim 66, wherein said selecting in (ii) 
comprises selecting at least one substring from an initial character string such that the ends of 
said substring occur in string regions of about 3 to about 20 characters in the initial character 
string that have higher sequence identity with the corresponding region of another of said 
initial character strings than the overall sequence identity between the two initial character 
strings. 

70. (Previously presented) The method of claim 66, wherein said selecting in (ii)
comprises selecting substrings such that the ends of said substrings occur in predefined 
motifs of about 4 to about 8 characters. 

71. (Previously presented) The method of claim 66, wherein said selecting in (ii) 
comprises aligning two or more of said initial character strings to maximize pairwise identity 
between two or more substrings of the initial character strings, and selecting a character that 
is a member of an aligned pair for the end of one of the two or more substrings. 

72. (Previously presented) The method of claim 66, wherein said method further 
comprises randomly altering one or more characters of said initial or product character 
strings. 

73. (Previously presented) The method of claim 66, wherein the one or more 
product strings of (vi) having greater than 50% sequence identity with the at least one initial 
character string. 

74. (Previously presented) The method of claim 66, wherein the one or more 
product strings of (vi) having greater than 75% sequence identity with the at least one initial 
character string. 

75. (Previously presented) The method of claim 66, wherein the one or more 
product strings of (vi) having greater than 85% sequence identity with the at least one initial 
character string. 

76. (Previously presented) The method of claim 66, wherein the one or more 
product strings of (vi) having greater than 90% sequence identity with the at least one initial 
character string. 

77. (Previously presented) The method of claim 66, wherein the one or more 
product strings of (vi) having greater than 95% sequence identity with the at least one initial 
character string.

78. (Currently amended) A computer program product on a computer readable 
media comprising computer code that: 

i) encodes two or more biological molecules into initial character strings to provide a 
collection of two or more different initial character strings wherein each of said biological 
molecules comprises at least about ten subunits; 

ii) selects at least two initial substrings from said initial character strings;

iii) concatenates said substrings to form one or more product strings about the same 
length as one or more of the initial character strings; 

iv) adds the product strings to a data structure to populate a data structure of product
strings;

v) provides an alignment of at least one of the product strings relative to at least one 
initial character string; and

vi) selects one or more product biological molecules for production, wherein the one 
or more product biological molecules correspond to one or more of the product strings having 
greater than 30% sequence identity with the at least one initial character string. 

79. (Previously presented) The computer program product of claim 78, wherein 
said computer code encodes two or more amino acid sequences into said character strings, 
and wherein said two or more amino acid sequences comprise an amino acid sequence 
encoding a naturally occurring protein. 

80. (Previously presented) The computer program product of claim 78, wherein 
said initial character strings have at least 30% sequence identity with each other. 

81. (Previously presented) The computer program product of claim 78, wherein
said computer code selects in (ii) at least one substring from an initial character string such 
that the ends of said substring occur in string regions of about three to about twenty 
characters in the initial character string that have higher sequence identity with a 
corresponding region of another of said initial character strings than the overall sequence 
identity between the two initial character substrings. 

82. (Previously presented) The computer program product of claim 78, wherein 
said computer code selects in (ii) by selecting substrings such that the ends of said substrings 
occur in predefined motifs of about 4 to about 8 characters. 

83. (Previously presented) The computer program product of claim 78, wherein 
said computer code selects in (ii) by aligning two or more of said initial character strings to 
maximize pairwise identity between two or more substrings of the initial character strings, 
and selecting a character that is a member of an aligned pair for the end of one of the two or 
more substrings. 

84. (Previously presented) The computer program product of claim 78, wherein
said computer code further randomly alters one or more characters of said initial or product 
character strings. 

85. (Previously presented) The computer program product of claim 78, wherein the 
one or more product strings of (vi) having greater than 50% sequence identity with the at 
least one initial character string. 

86. (Previously presented) The computer program product of claim 78, wherein the 
one or more product strings of (vi) having greater than 75% sequence identity with the at 
least one initial character string. 

87. (Previously presented) The computer program product of claim 78, wherein the 
one or more product strings of (vi) having greater than 85% sequence identity with the at 
least one initial character string. 

88. (Previously presented) The computer program product of claim 78, wherein the 
one or more product strings of (vi) having greater than 90% sequence identity with the at 
least one initial character string. 

89. (Previously presented) The computer program product of claim 78, wherein the 
one or more product strings of (vi) having greater than 95% sequence identity with the at 
least one initial character string. 

90. (Previously presented) A method of identifying molecules for production, 
wherein the molecules are represented by concatenated strings, said method comprising: 

i) encoding two or more naturally occurring biological molecules into a data structure 
of initial character strings to provide a collection of two or more different initial character 
strings wherein each of said biological molecules comprises at least about 10 subunits; 

ii) selecting at least two substrings from said initial character strings; 

iii) concatenating said substrings to form one or more product strings about the same 
length as one or more of the initial character strings; 

iv) adding the product strings to a data structure to populate a data structure of 
product strings; and 

v) selecting one or more product biological molecules for production, wherein the 
one or more product biological molecules correspond to one or more of the product strings 
having greater than 30% sequence identity with the at least one initial character string. 

91. (Previously presented) The method of claim 90, wherein said encoding
comprises encoding two or more nucleic acid sequences into said character strings. 

92. (Previously presented) The method of claim 90, wherein said encoding 
comprises encoding two or more amino acid sequences into said character strings, and 
wherein said two or more amino acid sequences comprise an amino acid sequence encoding a 
naturally occurring protein. 

93. (Previously presented) The method of claim 90, wherein said initial character 
strings have at least 30% sequence identity with each other. 

94. (Previously presented) The method of claim 90, wherein said selecting in (ii) 
comprises selecting at least one substring from an initial character string such that the ends of 
said substring occur in string regions of about 3 to about 20 characters in the initial character 
string that have higher sequence identity with the corresponding region of another of said 
initial character strings than the overall sequence identity between the two initial character 
strings. 

95. (Previously presented) The method of claim 90, wherein said selecting in (ii) 
comprises selecting substrings such that the ends of said substrings occur in predefined 
motifs of about 4 to about 8 characters. 

96. (Previously presented) The method of claim 90, wherein said selecting in (ii)
comprises aligning two or more of said initial character strings to maximize pairwise identity 
between two or more substrings of the initial character strings, and selecting a character that 
is a member of an aligned pair for the end of one of the two or more substrings. 

97. (Previously presented) The method of claim 90, wherein said method further 
comprises randomly altering one or more characters of said initial or product character 
strings. 

98. (Previously presented) The method of claim 90, wherein the one or more 
product strings of (v) having greater than 50% sequence identity with the at least one initial 
character string. 

99. (Previously presented) The method of claim 90, wherein the one or more
product strings of (v) having greater than 75% sequence identity with the at least one initial 
character string. 

100. (Previously presented) The method of claim 90, wherein the one or more 
product strings of (v) having greater than 85% sequence identity with the at least one initial 
character string. 

101. (Previously presented) The method of claim 90, wherein the one or more 
product strings of (v) having greater than 90% sequence identity with the at least one initial 
character string. 

102. (Previously presented) The method of claim 90, wherein the one or more 
product strings of (v) having greater than 95% sequence identity with the at least one initial 
character string. 

103. (Currently amended) A computer program product on a computer readable 
media comprising computer code that: 

i) encodes two or more naturally occurring biological molecules into initial character 
strings to provide a collection of two or more different initial character strings wherein each 
of said biological molecules comprises at least about ten subunits; 

ii) selects at least two initial substrings from said initial character strings; 

iii) concatenates said substrings to form one or more product strings about the same 
length as one or more of the initial character strings; 

iv) adds the product strings to a data structure to populate a data structure of product 
strings; and 

v) selects one or more product biological molecules for production, wherein the one 
or more product biological molecules correspond to one or more of the product strings having 
greater than 30% sequence identity with the at least one initial character string. 

104. (Previously presented) The computer program product of claim 103, wherein 
said computer code encodes by encoding two or more nucleic acid sequences into said 
character strings. 

105. (Previously presented) The computer program product of claim 103, wherein 
said computer code encodes two or more amino acid sequences into said character strings, 
and wherein said two or more amino acid sequences comprise an amino acid sequence
encoding a naturally occurring protein. 

106. (Previously presented) The computer program product of claim 103, wherein 
said initial character strings have at least 30% sequence identity with each other. 

107. (Previously presented) The computer program product of claim 103, wherein 
said computer code selects in (ii) at least one substring from an initial character string such 
that the ends of said substring occur in string regions of about three to about twenty 
characters in the initial character string that have higher sequence identity with a 
corresponding region of another of said initial character strings than the overall sequence 
identity between the two initial character substrings. 

108. (Previously presented) The computer program product of claim 103, wherein 
said computer code selects in (ii) by selecting substrings such that the ends of said substrings 
occur in predefined motifs of about 4 to about 8 characters. 

109. (Previously presented) The computer program product of claim 103, wherein 
said computer code selects in (ii) by aligning two or more of said initial character strings to 
maximize pairwise identity between two or more substrings of the initial character strings, 
and selecting a character that is a member of an aligned pair for the end of one of the two or 
more substrings. 

110. (Previously presented) The computer program product of claim 103, wherein 
said computer code further randomly alters one or more characters of said initial or product 
character strings. 

111. (Previously presented) The computer program product of claim 103, wherein
the one or more product strings of (v) having greater than 50% sequence identity with the at 
least one initial character string. 

112. (Previously presented) The computer program product of claim 103, wherein 
the one or more product strings of (v) having greater than 75% sequence identity with the at 
least one initial character string. 

113. (Previously presented) The computer program product of claim 103, wherein 
the one or more product strings of (v) having greater than 85% sequence identity with the at 
least one initial character string. 

114. (Previously presented) The computer program product of claim 103, wherein
the one or more product strings of (v) having greater than 90% sequence identity with the at 
least one initial character string. 

115. (Previously presented) The computer program product of claim 103, wherein 
the one or more product strings of (v) having greater than 95% sequence identity with the at 
least one initial character string. 

116. (Currently amended) A method of identifying molecules for production, 
wherein the molecules are represented by concatenated strings, said method comprising: 

i) encoding two or more biological molecules into a data structure of initial character 
strings to provide a collection of two or more different initial character strings wherein each 
of said biological molecules comprises at least about 10 subunits; 

ii) selecting at least two substrings from said initial character strings; 

iii) concatenating said substrings to form one or more product strings about the same 
length as one or more of the initial character strings; 

iv) adding the product strings to a data structure to populate a data structure of 
product strings; 

v) obtaining one or more computationally predicted properties for at least one of the 
product strings in the data structure; and 

vi) selecting one or more product biological molecules for production on the basis of 
the one or more computationally predicted properties. 

117. (Previously presented) The method of claim 116, wherein the computationally 
predicted properties comprise one or more of a maximum or minimum molecular weight, a 
maximum or minimum free energy, a maximum or minimum contact surface with a target 
molecule or surface, a specified net charge, a predicted pK, a predicted pI, a binding avidity,
secondary form, and tertiary form. 
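
As a hedged example of one property listed in claim 117, the sketch below estimates a net charge for amino acid product strings and selects those matching a specified value. The residue-counting rule (lysine and arginine as +1, aspartate and glutamate as -1, with histidine and the termini ignored) is a crude approximation chosen here for illustration, and the function names are hypothetical.

```python
def net_charge(peptide):
    """Crude net charge estimate from one-letter amino acid codes.

    Assumption: K and R count as +1, D and E as -1; histidine, the termini,
    and pH effects are ignored.
    """
    positive = sum(peptide.count(residue) for residue in "KR")
    negative = sum(peptide.count(residue) for residue in "DE")
    return positive - negative


def select_by_net_charge(products, wanted):
    """Keep product strings whose estimated net charge equals `wanted`."""
    return [p for p in products if net_charge(p) == wanted]


product_strings = ["MKKRAG", "MDEAGK", "MAGAGA"]
neutral = select_by_net_charge(product_strings, 0)
```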

118. (Previously presented) The method of claim 116, wherein said encoding 
comprises encoding two or more amino acid sequences into said character strings. 

119. (Previously presented) The method of claim 116, wherein said selecting in (ii)
comprises aligning two or more of said initial character strings to maximize pairwise identity 
between two or more substrings of the initial character strings, and selecting a character that
is a member of an aligned pair for the end of one of the two or more substrings. 

120. (Previously presented) The method of claim 116, wherein said method further 
comprises randomly altering one or more characters of said initial or product character 
strings. 

121. (Previously presented) The method of claim 116, wherein the one or more 
product biological molecules of (vi) having greater than 50% sequence identity with the at 
least one initial character string. 

122. (Previously presented) The method of claim 116, wherein the one or more 
product biological molecules of (vi) having greater than 75% sequence identity with the at 
least one initial character string. 

123. (Previously presented) The method of claim 116, wherein the one or more 
product biological molecules of (vi) having greater than 90% sequence identity with the at 
least one initial character string. 

124. (Currently amended) A computer program product on a computer readable 
media comprising computer code that: 

i) encodes two or more biological molecules into initial character strings to provide a 
collection of two or more different initial character strings wherein each of said biological 
molecules comprises at least about ten subunits; 

ii) selects at least two initial substrings from said initial character strings; 

iii) concatenates said substrings to form one or more product strings about the same
length as one or more of the initial character strings; 

iv) adds the product strings to a data structure to populate a data structure of product
strings;

v) obtains one or more computationally predicted properties for at least one of the 
product strings in the data structure; and 

vi) selects one or more product biological molecules for production on the basis of 
the one or more computationally predicted properties. 

125. (Previously presented) The computer program product of claim 124, wherein 
the computationally predicted properties comprise one or more of a maximum or minimum 
molecular weight, a maximum or minimum free energy, a maximum or minimum contact
surface with a target molecule or surface, a specified net charge, a predicted pK, a predicted 
pI, a binding avidity, secondary form, and tertiary form. 

126. (Previously presented) The computer program product of claim 124, wherein 
the computer code encodes in (i) by encoding two or more amino acid sequences into said 
character strings. 

127. (Previously presented) The computer program product of claim 124, wherein 
the computer code selects in (ii) by aligning two or more of said initial character strings to 
maximize pairwise identity between two or more substrings of the initial character strings, 
and selecting a character that is a member of an aligned pair for the end of one of the two or 
more substrings. 
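
By way of illustration only, the alignment-based selection of claim 127 might be sketched as below. Python's standard `difflib.SequenceMatcher` is used as a stand-in for an identity-maximizing alignment; it finds matching blocks heuristically and is not a true optimal aligner. The names `aligned_pairs` and `crossover_at` are hypothetical.

```python
from difflib import SequenceMatcher


def aligned_pairs(a, b):
    """Return (i, j) index pairs with a[i] == b[j] taken from difflib's matching blocks."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [(m.a + k, m.b + k)
            for m in matcher.get_matching_blocks()
            for k in range(m.size)]


def crossover_at(a, b, pair):
    """End the first substring on the aligned character, then continue in b."""
    i, j = pair
    return a[:i + 1] + b[j + 1:]
```

Because the first substring ends on a character that both strings share at the aligned position, the product string switches parents at a point where the two sequences agree.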

128. (Previously presented) The computer program product of claim 124, wherein 
the computer code further randomly alters one or more characters of said initial or product 
character strings. 

129. (Previously presented) The computer program product of claim 124, wherein 
the one or more product biological molecules of (vi) having greater than 50% sequence 
identity with the at least one initial character string. 

130. (Previously presented) The computer program product of claim 124, wherein 
the one or more product biological molecules of (vi) having greater than 75% sequence 
identity with the at least one initial character string. 

131. (Previously presented) The computer program product of claim 124, wherein 
the one or more product biological molecules of (vi) having greater than 90% sequence 
identity with the at least one initial character string. 

132. (Currently amended) The method of claim 1, wherein adding the product 
strings to a data structure comprises adding more than one product strings string to the data 
structure. 

133. (Previously presented) The method of claim 1, wherein selecting at least two 
substrings from said initial character strings comprises random substring selection. 

134. (Previously presented) The method of claim 1, wherein selecting at least two 
substrings from said initial character strings comprises uniform substring selection. 

135. (Previously presented) The method of claim 1, wherein selecting at least two 
substrings from said initial character strings comprises motif-based selection. 

136. (Previously presented) The method of claim 1, wherein selecting at least two 
substrings from said initial character strings comprises alignment-based selection. 

137. (Previously presented) The method of claim 1, wherein selecting at least two 
substrings from said initial character strings comprises frequency-biased selection. 

138. (Currently amended) The computer program product of claim 17, wherein the 
computer code adds the product strings to a data structure by adding more than one product 
strings string to the data structure. 

139. (Previously presented) The computer program product of claim 17, wherein the 
computer code selects at least two substrings from said initial character strings by a random 
substring selection. 

140. (Previously presented) The computer program product of claim 17, wherein the 
computer code selects at least two substrings from said initial character strings by a uniform 
substring selection. 

141. (Previously presented) The computer program product of claim 17, wherein the 
computer code selects at least two substrings from said initial character strings by a motif- 
based selection. 

142. (Previously presented) The computer program product of claim 17, wherein the 
computer code selects at least two substrings from said initial character strings by an 
alignment-based selection. 

143. (Previously presented) The computer program product of claim 17, wherein the 
computer code selects at least two substrings from said initial character strings by a 
frequency-biased selection. 

144. (Currently amended) The method of claim 66, wherein adding the product 
strings to a data structure comprises adding more than one product strings string to the data 
structure. 

145. (Previously presented) The method of claim 66, wherein selecting at least two 
substrings from said initial character strings comprises random substring selection. 

146. (Previously presented) The method of claim 66, wherein selecting at least two 
substrings from said initial character strings comprises uniform substring selection. 

147. (Previously presented) The method of claim 66, wherein selecting at least two 
substrings from said initial character strings comprises motif-based selection. 

148. (Previously presented) The method of claim 66, wherein selecting at least two 
substrings from said initial character strings comprises alignment-based selection. 

149. (Previously presented) The method of claim 66, wherein selecting at least two 
substrings from said initial character strings comprises frequency-biased selection. 

150. (Currently amended) The computer program product of claim 78, wherein the 
computer code adds the product strings to a data structure by adding more than one product 
strings string to the data structure. 

151. (Previously presented) The computer program product of claim 78, wherein the 
computer code selects at least two substrings from said initial character strings by a random 
substring selection. 

152. (Previously presented) The computer program product of claim 78, wherein the 
computer code selects at least two substrings from said initial character strings by a uniform 
substring selection. 


153. (Previously presented) The computer program product of claim 78, wherein the 
computer code selects at least two substrings from said initial character strings by a motif- 
based selection. 

154. (Previously presented) The computer program product of claim 78, wherein the 
computer code selects at least two substrings from said initial character strings by an 
alignment-based selection. 

155. (Previously presented) The computer program product of claim 78, wherein the 
computer code selects at least two substrings from said initial character strings by a 
frequency-biased selection. 

156. (Currently amended) The method of claim 90, wherein adding the product 
strings to a data structure comprises adding more than one product strings string to the data 
structure. 

157. (Previously presented) The method of claim 90, wherein selecting at least two 
substrings from said initial character strings comprises random substring selection. 

158. (Previously presented) The method of claim 90, wherein selecting at least two 
substrings from said initial character strings comprises uniform substring selection. 

159. (Previously presented) The method of claim 90, wherein selecting at least two 
substrings from said initial character strings comprises motif-based selection. 

160. (Previously presented) The method of claim 90, wherein selecting at least two 
substrings from said initial character strings comprises alignment-based selection. 

161. (Previously presented) The method of claim 90, wherein selecting at least two 
substrings from said initial character strings comprises frequency-biased selection. 

162. (Currently amended) The computer program product of claim 103, wherein 
the computer code adds the product strings to a data structure by adding more than one 
product strings string to the data structure. 

163. (Previously presented) The computer program product of claim 103, wherein 
the computer code selects at least two substrings from said initial character strings by a 
random substring selection. 

164. (Previously presented) The computer program product of claim 103, wherein 
the computer code selects at least two substrings from said initial character strings by a 
uniform substring selection. 

165. (Previously presented) The computer program product of claim 103, wherein 
the computer code selects at least two substrings from said initial character strings by a motif- 
based selection. 

166. (Previously presented) The computer program product of claim 103, wherein 
the computer code selects at least two substrings from said initial character strings by an 
alignment-based selection. 

167. (Previously presented) The computer program product of claim 103, wherein 
the computer code selects at least two substrings from said initial character strings by a 
frequency-biased selection. 

168. (Currently amended) The method of claim 116, wherein adding the product 
strings to a data structure comprises adding more than one product strings string to the data 
structure. 

169. (Previously presented) The method of claim 116, wherein selecting at least two 
substrings from said initial character strings comprises random substring selection. 

170. (Previously presented) The method of claim 116, wherein selecting at least two 
substrings from said initial character strings comprises uniform substring selection. 

171. (Previously presented) The method of claim 116, wherein selecting at least two 
substrings from said initial character strings comprises motif-based selection. 

172. (Previously presented) The method of claim 116, wherein selecting at least two 
substrings from said initial character strings comprises alignment-based selection. 

173. (Previously presented) The method of claim 116, wherein selecting at least two 
substrings from said initial character strings comprises frequency-biased selection. 

174. (Currently amended) The computer program product of claim 124, wherein 
the computer code adds the product strings to a data structure by adding more than one 
product strings string to the data structure. 

175. (Previously presented) The computer program product of claim 124, wherein 
the computer code selects at least two substrings from said initial character strings by a 
random substring selection. 

176. (Previously presented) The computer program product of claim 124, wherein 
the computer code selects at least two substrings from said initial character strings by a 
uniform substring selection. 

177. (Previously presented) The computer program product of claim 124, wherein 
the computer code selects at least two substrings from said initial character strings by a motif- 
based selection. 

178. (Previously presented) The computer program product of claim 124, wherein 
the computer code selects at least two substrings from said initial character strings by an 
alignment-based selection. 

179. (Previously presented) The computer program product of claim 124, wherein 
the computer code selects at least two substrings from said initial character strings by a 
frequency-biased selection. 